Water resource and environmental engineers need accurate
information in harnessing water for diverse uses; it is therefore
expedient to accurately predict dry and wet climatic phases in
order to ensure optimum water resource planning and
management. This study examined the applicability of machine
learning models for the prediction of extreme dry and wet 
conditions in Minna, North Central Nigeria. Recorded rainfall, 
maximum temperature, minimum temperature, relative 
humidity, wind speed, sunshine hours and estimated potential 
evapotranspiration were used as predictors in the machine 
learning models, while hydrological extremes estimated from 
the standardized precipitation index (SPI) served as the response
variable. The performance of Support Vector Machine (SVM)
models based on different kernel types and Artificial Neural Network
(ANN) models based on different network structures was assessed for
the prediction of the different phases of the climate of the study 
area. The study showed that while normal meteorological 
conditions occurred for about 74.8% of the study period, 8.9%, 
4.6% and 1.9% of this period were moderately wet, severely 
wet and extremely wet respectively, and 4.7%, 3.4% and 1.7% 
of the study period were moderately dry, severely dry and 
extremely dry respectively. Furthermore, SVM based on the radial
basis kernel, with a coefficient of determination of 0.64,
outperformed the other SVM types, and ANN with two hidden
layers, with a coefficient of determination of 0.68, was found to
perform better than ANN with a single hidden layer. Generally, ANN
was found to have higher accuracy than SVM in predicting dry
and wet climatic phases in North Central Nigeria.


1. Introduction


Water is necessary for agricultural, domestic, industrial, recreational, and ecological purposes [1].
Its demand has already surpassed its supply in many parts of the world, and many more areas are
anticipated to face this disparity in the near future [2]. Water resources management is of direct
interest to all individuals [3], since the use of water is a part of everyday living. Such extensive
concern about water is not exceptional, as is often claimed by many professionals, because of its roles
in other essential sectors such as transportation, environment, health, and energy [4]. 


The ability to predict dry and wet climatic phases is immensely important for moderating
the effects of extreme events on surface water, groundwater, and water resources management
projects such as hydro-electric power plants, irrigation structures, flood control structures, and
others. Droughts affected 50 percent of the 2.8 billion people who suffered natural disasters
between 1967 and 1991, and of the 3.5 million people who lost their lives to natural disasters, 35
percent died as a result of droughts [5]. More than 50 percent of the most populous areas in the
world are susceptible to drought. Droughts trigger a compound set of effects that range across
several parts of the economy and reach well beyond the physical space affected by them
(Mishra and Desai, 2006), while flooding has caused tremendous losses of
property and sometimes life [6].


When stakeholders are in dispute and important decisions about water use must be made,
managers commonly rely on decision support tools such as models [7]. In the case of
extremes, decision-support tools usually focus on their modeling, forecasting, and management. 
Hydrological issues such as drought and flood could be avoided if appropriate and accurate 
forecasting tools are in place [6]. 


Drought occurs when hydrological variables such as rainfall and streamflow fall below a minimum
threshold, while flood occurs when rainfall and streamflow rise above a maximum
threshold. While drought usually creates the impression of water shortage due to
inadequate rainfall, excessive evapotranspiration, overuse of water resources, or a mixture of
these factors [8], flood gives the opposite impression. The principal cause of a
drought is the absence of precipitation over a large spatial extent for an extended period of time,
known as meteorological drought [9], while the primary cause of flood is more available
water than the draining capacity of an area can handle over a period of time.


Extreme events affect nearly all regions of the world and lead to weighty economic, social, and
environmental impacts [10]; the optimization of informed decision-making toward reducing their
risk therefore cannot be overemphasized. The decision-making procedure commences by combining
and unifying data into pieces of information that are then modeled to create insights, which form
the foundation on which decisions are made [10].


The ability to predict extreme weather events is immensely important in order to mitigate their
effects [5], and water resources and environmental engineers need accurate information in
harnessing water for diverse uses. There is therefore a need to emphasize the use of non-linear models that
can represent the complex water resources system with minimum error for preparation and
mitigation.


Past studies have highlighted some problems with water resources in Nigeria [11]. The country is
labeled as being water-short, and its water availability is likely to decline from 2,506 cubic meters per
year in 1995 to 1,175 in 2025 if not managed properly [12]. The Food and Agriculture
Organization (FAO) estimated that by 2025 about 1.8 billion persons will be residing in areas with
complete scarcity of water, and about two-thirds of the world’s population could be in water
stress conditions. The current world population is about 7 billion people, with the potential of
rising to 9 billion in the next 40 years [12]. Ayanshola et al. [11] identified climate variability as
one of twenty-two (22) challenges of water resources development and management in the North
Central zone of Nigeria.


One of the widely used methods for assessing meteorological extremes is the Standardized
Precipitation Index (SPI) established by McKee et al. [13]. This method can
compute the rainfall deficit for multiple time scales while reflecting its impact on the
availability of different water supplies. This versatility makes SPI suitable for
monitoring both long-term and short-term supplies of water.


Short-term extremes are hard to diagnose; hence, to assess them, it is expedient to use short time
scales. For instance, drought can be analyzed more thoroughly at short time
scales, because events missed at a longer scale are captured at shorter ones. This study therefore seeks to
model and predict SPI based on the one-month time scale using SVM and ANN. This will serve
as decision support in water resources management in the study area.


2. Description of the study area 


The study area lies between latitudes 9°37'N and 9°79'N and longitudes 6°16'E and 6°65'E. Minna is the capital of
Niger State, one of the major growing states of North Central Nigeria [14]. Minna is situated
about 150 kilometers from the capital of the Federal Republic of Nigeria. The state has a
population of about 506,113 persons with an average population density of about 3448 persons
per km² [15]. The population growth in Minna is claimed to be greater than the average of the
whole nation as a result of its closeness to Abuja, the capital of the Federal Republic of
Nigeria [14].


The geologic formation of the study area is based on the undifferentiated basement complex of
mainly gneiss and magnetite [16]. The study area lies within a region with a tropical dry and
wet climate [16]. The region is characterized by
double rainfall maxima. The study area has a mean annual precipitation of 1300 mm [16]. The
rainy season usually commences in April and lasts till October, with fluctuations in the
amount of rainfall received per year. The highest mean monthly rainfall occurs in September,
with almost 300 mm. Temperature is uniformly high throughout the year, reaching peaks of
40°C (Feb./March) and 30°C (Nov./Dec.) [16].
Fig. 1. Map of the Study Area.


3. Methodology


3.1. Materials 


Hydro-meteorological data for the Minna rainfall station were obtained from the Nigeria
Meteorological Agency (NIMET); they include rainfall, maximum temperature, minimum
temperature, sunshine hours, relative humidity, pan evaporation, and wind speed. The time series
covered 55 years, from 1961 to 2015.


3.2. Modified Penman-Monteith method


The Penman-Monteith method, which was modified by the Food and Agriculture Organization
(FAO) in 1963, was adopted for estimating reference evapotranspiration in the study area. The
Modified Penman-Monteith method, according to Doorenbos and Pruitt (1977) [17] and Jennifer
(2001) [18], can be mathematically expressed as follows:


$$ET_o = C\left[\frac{W\,(R_n - G)}{\lambda\,\rho_w} + (1 - W)\,f(u)\,(e_a - e_d)\right] \tag{1}$$


where $ET_o$ is the reference evapotranspiration, mm/day; $W$ is a weighting factor related to temperature and the
altitude of the area; $R_n$ is the net solar radiation, MJ m$^{-2}$ d$^{-1}$; $G$ is the soil heat flux,
MJ m$^{-2}$ d$^{-1}$; $\lambda$ is the latent heat of evaporation, MJ kg$^{-1}$; $\rho_w$ = 1000 kg m$^{-3}$ is the density of water;
$f(u)$ is the wind-related function, kg hPa$^{-1}$ m$^{-2}$ d$^{-1}$; $e_a$ is the saturation vapour pressure at mean air
temperature, hPa; $e_d$ is the mean actual vapour pressure of the air, hPa; and $C$ is an adjustment
factor to account for day and night weather conditions.


3.3. Standardized precipitation index (SPI) 


The generally adopted distribution for SPI estimation is the two-parameter gamma distribution,
which has shape and scale parameters. Its cumulative probability is obtained by integrating the
probability density function:


$$G(x) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)} \int_0^x t^{\alpha-1} e^{-t/\beta}\,dt \quad \text{for } x > 0 \tag{2}$$

where $\alpha$ is the shape parameter, $\beta$ is the scale parameter, $x$ is the rainfall series and $\Gamma(\alpha)$ is the
gamma function. The gamma distribution is undefined for $x = 0$, but rainfall can have a zero
value; consequently, the derived cumulative probability distribution (CPD) that accounts for zero values is:


$$H(x) = q + (1 - q)\,G(x) \tag{3}$$


where $q$ is the probability of the zero-rainfall value. The CPD is then transformed into the standard
normal distribution to compute SPI. The value of SPI specifies the intensity of the anomaly in
climatic phases, while the classification of dry/wet intensity as presented by McKee et al. (1993)
[13] can be found in Table 1. In this study, a one-month time scale SPI was adopted.
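

The SPI procedure described above (gamma fit, mixed distribution for zero rainfall, and transformation to the standard normal) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the gamma parameters are estimated with Thom's approximation to maximum likelihood (the paper does not state which estimator was used), and a single series is fitted as a whole, whereas operational SPI is commonly fitted separately for each calendar month.

```python
import math
from statistics import NormalDist, fmean


def reg_lower_gamma(a, x, tol=1e-12, max_iter=100_000):
    """Regularized lower incomplete gamma P(a, x) via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_iter):
        term *= x / (a + n)
        total += term
        if term < tol * total:
            break
    log_prefactor = -x + a * math.log(x) - math.lgamma(a)
    return min(1.0, total * math.exp(log_prefactor))


def fit_gamma_thom(positive_values):
    """Shape and scale by Thom's approximation (assumed estimator).

    Requires at least two distinct positive values.
    """
    mean = fmean(positive_values)
    a_stat = math.log(mean) - fmean(math.log(v) for v in positive_values)
    shape = (1.0 + math.sqrt(1.0 + 4.0 * a_stat / 3.0)) / (4.0 * a_stat)
    scale = mean / shape
    return shape, scale


def spi_series(rainfall, eps=1e-6):
    """SPI for each value: H(x) = q + (1 - q) G(x), then inverse normal."""
    positive = [r for r in rainfall if r > 0.0]
    q = 1.0 - len(positive) / len(rainfall)  # probability of zero rainfall
    shape, scale = fit_gamma_thom(positive)
    std_normal = NormalDist()
    spi = []
    for r in rainfall:
        g = reg_lower_gamma(shape, r / scale) if r > 0.0 else 0.0
        h = q + (1.0 - q) * g
        h = min(max(h, eps), 1.0 - eps)  # keep inv_cdf inside (0, 1)
        spi.append(std_normal.inv_cdf(h))
    return spi
```

Calling `spi_series(monthly_rainfall)` on a list of monthly totals in mm returns one SPI value per month; months with zero rainfall all receive the SPI that corresponds to the probability `q`.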


Table 1
Dry and Wet Climatic Phases based on SPI values.
Category          SPI values
Extremely wet     > 2.0
Severely wet      1.50 to 1.99
Moderately wet    1.00 to 1.49
Near normal       0.99 to -0.99
Moderately dry    -1.00 to -1.49
Severely dry      -1.50 to -1.99
Extremely dry     < -2.0
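

The classification in Table 1 can be expressed as a small lookup function. The sketch below is illustrative; the function names are hypothetical, and because the table leaves the boundary values of exactly 2.0 and -2.0 unassigned, they are assumed here to belong to the extreme classes.

```python
from collections import Counter


def classify_spi(spi):
    """Map an SPI value to the categories of Table 1.

    Assumption: values of exactly 2.0 and -2.0 are treated as extreme.
    """
    if spi >= 2.0:
        return "Extremely wet"
    if spi >= 1.5:
        return "Severely wet"
    if spi >= 1.0:
        return "Moderately wet"
    if spi > -1.0:
        return "Near normal"
    if spi > -1.5:
        return "Moderately dry"
    if spi > -2.0:
        return "Severely dry"
    return "Extremely dry"


def category_frequencies(spi_values):
    """Count how many months fall in each category."""
    return Counter(classify_spi(v) for v in spi_values)
```

Applying `category_frequencies` to a monthly SPI series gives the kind of frequency tally that a table of SPI classes summarizes.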


3.4. Machine learning models development 


In the present study, three major steps were used for the development of non-linear models for 
predicting SPI. The first step was the selection of input and target data for calibration
and validation. Rainfall, estimated evapotranspiration, maximum and minimum temperature, 
relative humidity, evaporation, sunshine duration, and wind speed served as input, while 
computed one-month SPI served as output. A total of 660 months was used, where 75% was 
randomly selected as a training set, and the remaining 25% was used for the validation of 
models. Secondly, the model structure was selected, and parameters were estimated, while the 
third step involved the validation of the selected model and evaluation of the performance of 
models. 
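

The random 75%/25% partition of the 660 months can be illustrated as follows. This is a minimal sketch: the function name and the seed are hypothetical, and the paper does not report which random generator or seed was used.

```python
import random


def split_months(n_months=660, train_fraction=0.75, seed=42):
    """Randomly partition month indices into training and validation sets."""
    indices = list(range(n_months))
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    rng.shuffle(indices)
    n_train = round(n_months * train_fraction)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


train_idx, test_idx = split_months()
```

With 660 months and a 75% training fraction, this yields 495 training months and 165 validation months.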


3.5. Artificial neural network (ANN)


According to Janaki (2013) [19], “an ANN is a non-linear mathematical model that has the 
ability to replicate detected properties of neuron systems and draws the analogies of adaptive 
biological learning system by using the learning rules and hybrid algorithm”. 


A network may vary from single to multiple layers, but a network’s basic structure usually
consists of three layers: an input layer, where data are presented to the network; one or more hidden
layers, where the data are processed; and an output layer, where the results are generated [19], as shown in figure
2. A backpropagation neural network with varying numbers of neurons and hidden layers was
adopted in this study.
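

To make the layered structure concrete, the sketch below builds a feed-forward network and evaluates one forward pass for an 8-8-4-1 structure (eight inputs, hidden layers of eight and four nodes, one output), which is one of the structures reported in Table 4. It is illustrative only: the function names, the tanh hidden activation, the linear output, and the random initial weights are assumptions, and the backpropagation training step itself is not shown.

```python
import math
import random


def init_network(layer_sizes, seed=0):
    """Create (weights, biases) for each layer with small random weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, inputs):
    """Propagate one input vector through the network."""
    activations = list(inputs)
    for idx, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activations)) + b
             for row, b in zip(weights, biases)]
        is_output = idx == len(layers) - 1
        # Assumed: tanh in hidden layers, linear output for regression.
        activations = z if is_output else [math.tanh(v) for v in z]
    return activations


def count_parameters(layers):
    """Total number of trainable weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


network = init_network([8, 8, 4, 1])
prediction = forward(network, [0.1] * 8)
```

For the 8-8-4-1 structure, `count_parameters(network)` gives 113 trainable values (72 + 36 + 5), which indicates how quickly model complexity grows relative to the number of training months.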


[Figure: network diagram with labelled INPUT LAYER and HIDDEN LAYER(S)]


Fig. 2. A typical Multilayered Neural Network Structure (Adapted from Saracoglu, 2008) [20].


3.6. Support vector machine (SVM) 


SVM is a machine learning technique that has been widely used for the prediction of variables
that involve non-linear relationships. It was introduced by Vapnik and Cortes in 1995.
According to Cover’s theorem (1965), “a linear function $f(\cdot)$ can be formulated in the high
dimensional feature space to represent a non-linear relation between the inputs ($x_i$) and the
output ($y_i$)”; this is further explained in [21]. This linear function is presented as follows:


$$y_i = f(x_i) = \langle w, \phi(x_i) \rangle + b \tag{4}$$


where $w$ and $b$ are model parameters.


The type of SVM adopted for the present study is used for regression problems and is
known as Support Vector Regression (SVR). According to Banafsheh and Mohsen (2011)
[21], SVR finds an optimized solution by minimizing the objective in equation 5:


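$$\min_{w,\,b,\,\xi,\,\xi^{*}} \; \frac{1}{2} w^{T} w + C \sum_{i=1}^{L} \left(\xi_i + \xi_i^{*}\right) \tag{5}$$

subject to

$$(\langle w, X_i \rangle + b) - y_i \le \varepsilon + \xi_i^{*}$$
$$y_i - (\langle w, X_i \rangle + b) \le \varepsilon + \xi_i$$
$$\xi_i,\ \xi_i^{*} \ge 0$$
$$w \in \chi,\quad \xi_i,\ \xi_i^{*} \in \mathbb{R},\quad b \in \mathbb{R},\quad i = 1, \dots, L$$


Where:

$L$: number of data points in the training dataset
$C$: model parameter
$X_i$: feature space data points
$w$: optimization problem solution
$e_i$: model residuals ($e_i = y_i - f(x_i)$)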


$\xi_i$ and $\xi_i^{*}$ “are positive slack variables and $C$ is a positive real-valued and pre-specified constant”
[21]. Zahraie et al. (2014) further posited that “the constant $C$ determines the amount up to which
deviations from $\varepsilon$ are tolerated; while deviations above $\varepsilon$ are denoted by $\xi_i$, whereas deviations
below $\varepsilon$ are denoted by $\xi_i^{*}$. $C$, which is always positive, is the penalty parameter on the training
error. Practically, selection of the kernel function $K(x_i, x_j) = \phi(x_i)^{T} \phi(x_j)$ is enough for training the
SVM”. The kernel functions and their parameters, suggested by previous researchers, are
displayed in Table 2. A detailed explanation of SVM can be found in Chen (2015) [22], Granata
et al. (2016) [23] and Du et al. (2017) [24].
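

The role of the slack variables and of the constant C can be illustrated numerically. The sketch below is not the solver used in the study; it only evaluates the SVR objective (half the squared norm of the weights plus C times the summed slacks) for given predictions. The function names are hypothetical, and the slack for observations lying above the tolerance band is taken as the unstarred variable, following the description quoted above.

```python
def slack_variables(y_true, y_pred, epsilon):
    """Return (xi, xi_star) for one observation.

    xi      : amount by which the observation exceeds the band from above
    xi_star : amount by which it falls below the band
    """
    deviation = y_true - y_pred
    xi = max(0.0, deviation - epsilon)
    xi_star = max(0.0, -deviation - epsilon)
    return xi, xi_star


def svr_objective(weights, y_true, y_pred, C, epsilon):
    """Evaluate 0.5 * w'w + C * sum(xi + xi_star) for given predictions."""
    regularization = 0.5 * sum(w * w for w in weights)
    penalty = sum(sum(slack_variables(y, f, epsilon))
                  for y, f in zip(y_true, y_pred))
    return regularization + C * penalty
```

Deviations smaller than the tolerance contribute nothing to the penalty, so a larger C punishes only the points that fall outside the band, at the cost of a less smooth function.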


Table 2
Kernel functions and Parameters (Zahraie et al., 2014).
Kernel type                    Kernel function
Linear                         $K(x_i, x_j) = \langle x_i, x_j \rangle$
Polynomial                     $K(x_i, x_j) = (\langle x_i, x_j \rangle + c)^{d}$
Sigmoid                        $K(x_i, x_j) = \tanh(b \langle x_i, x_j \rangle - c)$
Radial Basis Function (RBF)    $K(x_i, x_j) = \exp(-\gamma \lVert x_i - x_j \rVert^{2})$
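

The four kernels in Table 2 can be written as plain functions of two input vectors. This is a minimal sketch with hypothetical function names; the default parameter values (c, d, b and gamma) are placeholders for illustration, not the values tuned in this study.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, c=1.0, d=2):
    return (dot(u, v) + c) ** d


def sigmoid_kernel(u, v, b=0.1, c=0.0):
    return math.tanh(b * dot(u, v) - c)


def rbf_kernel(u, v, gamma=0.5):
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)
```

The RBF kernel depends only on the distance between two points and equals 1 when they coincide, which makes it a common default when the form of the non-linearity is unknown.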


4. Results


4.1. Descriptive analysis of hydro-meteorological variables


The descriptive statistics of the hydro-meteorological variables used in this study are represented
by the box plots presented in figures 2 to 9. Figure 2 shows that rainfall was generally low
between November and March (dry months) and substantial between April and October (wet
months). As depicted in figures 3 and 4, the maximum temperature was generally
low in wet months and high in dry months, while the minimum temperature was lowest in
the dry months. Figures 5 and 6 show the distribution of relative humidity and sunshine duration.
Relative humidity was generally high in the wet months, when sunshine
duration was low due to high cloud cover, and low in dry months, when sunshine
duration was relatively high due to low cloud cover and atmospheric water content. Wind speed
did not follow the pattern of the other hydro-meteorological
variables; its highest value was recorded in May and its lowest value in
January (figure 7). The recorded evaporation and the computed potential
evapotranspiration followed similar patterns, with the lowest values in wet months (figures 8 and
9). This implies that more water is generally lost to the atmosphere by plants and open water
bodies during the dry months in the study area.
Fig. 2. Box Plot of Monthly Rainfall Distribution.
Fig. 3. Box Plot of Monthly Maximum Temperature Distribution.
Fig. 4. Box Plot of Monthly Minimum Temperature Distribution.
Fig. 5. Box Plot of Monthly Relative Humidity Distribution.
Fig. 6. Box Plot of Monthly Sunshine Hour Distribution.
Fig. 7. Box Plot of Monthly Wind Speed Distribution.
Fig. 8. Box Plot of Monthly Evaporation Distribution.
Fig. 9. Box Plot of Monthly Maximum Potential Evapotranspiration Distribution.


4.2. Analysis of standardized precipitation index 


Table 3 shows the frequency of occurrence of each SPI category during the study period. The table
shows that the majority of the months in the study period had near normal SPI (74.8%), while
the remaining 25.2% of the study period experienced moderate to extreme conditions. As presented in
table 3, only thirteen and eleven months experienced extreme wet and dry conditions
respectively throughout the study period, indicating that extreme conditions are scarce but could
cause havoc to water resource systems in the study area if not monitored.


Table 3
Frequency of Standardized Precipitation Index (SPI) Class (1961-2015).
Category          SPI values        Frequency (months)   % Occurrence
Extremely wet     > 2.0             13                   1.9
Severely wet      1.50 to 1.99      30                   4.6
Moderately wet    1.00 to 1.49      58                   8.9
Near normal       0.99 to -0.99     490                  74.8
Moderately dry    -1.00 to -1.49    31                   4.7
Severely dry      -1.50 to -1.99    22                   3.4
Extremely dry     < -2.0            11                   1.7


Figure 10 shows the variability of the computed one-month SPI of the study area. Extreme wet
conditions were experienced in 1962, 1966, 1983, 1986, 1988, 1991,
2003, 2007, 2010, 2014 and 2015, while extreme dry conditions were experienced in 1976, 1985,
1987, 1989, 1992, 2002, 2004, 2008, 2011 and 2012.
Fig. 10. SPI Values Based on a One-month Time Scale.


4.3. Mapping of hydro-meteorological variables to SPI using machine learning models 


Eight hydro-meteorological variables that include rainfall, maximum temperature, minimum 
temperature, relative humidity, wind speed, sunshine duration, evaporation, and potential 
evapotranspiration, were considered for simulation in the SVM and ANN models, since these
hydro-meteorological indices influence water resources management decisions. 


After a series of trials in the training and testing of multiple ANN models with
different hidden layers, the most suitable model, with the lowest mean square error, was selected
(figure 11). This ANN model, which consists of two hidden layers with eight and four nodes
respectively (figure 11), was selected due to its ability to perform well with test data. Table 4
shows the performance of the models, comparing the SPI predicted by the machine
learning models with the computed SPI. It can be observed in the table that SVM with radial
kernel performed better in predicting the SPI values of the training set than SVM with other
kernels and ANN, because of its lower mean square error (MSE) and root mean square error
(RMSE) of 0.2151 and 0.4638 respectively, and its higher correlation and coefficient of determination
(0.8563 and 0.7333 respectively). ANN with two hidden layers, however, performed best in
predicting the testing set, since it had the lowest MSE and RMSE of 0.3221
and 0.5675 respectively, a correlation of 0.8315 and a coefficient of determination of 0.6914.


The fact that the predicted and the computed SPI have a strong linear relationship (correlation
greater than 0.8) for both models undermines any claim of superiority of one model over the other (table
4) in predicting dry and wet climate phases in the study area. Figure 12 and figure 13 show the
relationship between the computed climate phases and those predicted by SVM with radial
kernel and ANN with two hidden layers respectively. Both models were able to explain more
than 60% of the variance in the computed SPI, whether for calibration or validation.


Table 4
Performance Results of Machine Learning Algorithms for Dry and Wet Climate Phases.
                   Training                              Testing
Model Type         R        R²      MSE      RMSE       R        R²      MSE      RMSE
SVM-Radial         0.8563   0.7333  0.2151   0.4638     0.8014   0.6422  0.3970   0.6300
SVM-Linear         0.7125   0.5076  0.3849   0.6204     0.7331   0.5375  0.4795   0.6925
SVM-Sigmoid        -0.1407  0.0198  24.1201  4.9112     -0.1318  0.0174  34.8234  5.9011
SVM-Polynomial     0.7003   0.4904  0.4057   0.6370     0.6005   0.3606  0.6549   0.8092
ANN (8-4-1)        0.7081   0.5015  1.5193   1.2326     0.8074   0.6519  0.3564   0.5970
ANN (8-8-4-1)      0.7769   0.6036  1.0317   1.0157     0.8315   0.6914  0.3221   0.5675
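

The four statistics reported in Table 4 can be computed from paired observed and predicted SPI values as sketched below. The function name is hypothetical, and R² is taken here as the square of the Pearson correlation, which is consistent with the tabulated values (for example, 0.8563 squared is approximately 0.7333, and the square root of the MSE of 0.2151 is approximately 0.4638).

```python
import math


def regression_metrics(observed, predicted):
    """Return R, R2, MSE and RMSE for paired observed/predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_o * var_p)  # Pearson correlation coefficient
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    return {"R": r, "R2": r ** 2, "MSE": mse, "RMSE": math.sqrt(mse)}
```

Because R² is computed from the correlation, a model can score a high R² while still carrying a large MSE if its predictions are systematically offset or scaled, so both kinds of measure are worth reading together.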
Fig. 11. Graphical Representation of ANN Model (Authors’ Experiment).
Fig. 12. Performance of SVM in Predicting Climate Phases based on Calibration (Right) and Validation
(Left).
Fig. 13. Performance of ANN in Predicting Climate Phases based on Calibration (Right) and Validation
(Left).


5. Discussion 


The suitability of extreme events prediction based on non-linear models is presented in this 
study. A time series of the data of 55 years covering 1961 to 2015, which constitutes 660 months 
was employed. Descriptive statistics of the hydro-meteorological variables reveal that rainfall
was generally low between November and March (dry months) and substantial between April
and October (wet months), when the temperature is lower due to cloud cover, high
atmospheric water content and low water loss from evaporation and transpiration. This
corroborates the findings of Olayemi et al. (2014) that the magnitude of the climatic
variables recorded in the area falls squarely within ranges reported for the Tropics as a whole.


In many water resources engineering applications, linear empirical modeling has been identified
to have many shortcomings. This is because these linear empirical models are established to
predict future events that are stochastic in nature, and since they are developed from observed
data representing past events, the models become less reliable. In many instances, the observed
data are finite and non-uniform, forming only a sparse distribution in the input space (Yonas et
al., 2000). Linear empirical models have been confirmed to offer acceptable predictive
performance when dealing with linear or close to linear relationships, but are usually unable to properly
explain nonlinear patterns that are hidden in hydro-meteorological data. The need to develop
reliable models for predicting water resources variables has aroused the interest of researchers in
machine learning models, which is why the present study assessed the performance
of ANNs of different structures and SVMs with different kernels in predicting the dry and wet climate
phases of the study area.


Machine learning models have taken the forefront of soft computing in the fields of hydrology and
civil engineering. Sahai et al. (2000) [25] employed ANN in the prediction of seasonal rainfall in
India. Toth et al. (2000) [26] compared an ARMA model, ANN, and KNN for approximating runoff
from rainfall, and found that ARMA outperformed the other models; in contrast, Somvanshi et
al. (2006) [27] noted that ANN outperformed ARIMA, and not ARMA, for predicting rainfall
series. Karamouz et al. (2009) [28] compared ANN and the Statistical Down-Scaling Model
(SDSM) for predicting rainfall and concluded that SDSM performs better, although it was
found to be a more data-intensive model than ANN. Khalili et al. (2011) [29] obtained satisfactory results
after applying ANN for rainfall prediction in Iran, while Geetha and Selvaraj (2011) [30]
emphasized the inability of ANN to predict sharp peaks of monthly rainfall. More recently,
Belayneh et al. (2016) [31] emphasized that a wavelet neural network was able to predict SPI more
accurately than SVM and ANN.


Recently, owing to the robustness of the Support Vector Machine (SVM) [24],
water resources and environmental engineers have considered the SVM method for the modeling
of water resources systems. For instance, Lu and Wang (2011) [32], Nayak and Ghosh
(2013) [33], Ortiz-Garcia (2014) [34], Sanchez-Monedero (2014) [35], Jinglin et al. (2017) [24] and
Sehad (2016) [36] applied SVM in rainfall studies, while Young (2017) [37] integrated
a physically-based model, the Hydrologic Modeling System (HEC-HMS), with a data-driven
Support Vector Regression (SVR) model for the prediction of runoff in Taiwan.


In the present study, it was found that, of all the kernel types considered for the SVM model,
SVR with radial kernel (R² = 0.64, RMSE = 0.63) outperformed the SVM with linear,
sigmoid and polynomial kernels, while ANN with two hidden layers (R² = 0.68, RMSE = 0.57)
outperformed the ANN model with a single hidden layer (R² = 0.66, RMSE = 0.64) in predicting dry and wet
climatic phases. The inability of the models to achieve a coefficient of determination above 0.68 could be a result of
overfitting in ANN and sensitivity to parameter selection [38], which are major shortcomings of
data-driven models. Moreover, the hydrological system is influenced by many factors such as
weather, land cover, infiltration and evapotranspiration, so it includes a good deal of stochastic,
multi-time-scale, and highly non-linear characteristics. Apart from the
challenges of overfitting and parameter selection that could reduce the performance of the
models, the models could be affected by the nonstationarity and seasonal irregularity
of hydrologic data signals [39]. Despite these shortcomings, the models were found to perform
better than those presented by Aiyelokun et al. (2017) [40], who used ANN to predict SPI for
drought investigation in part of the Tropics and found that the ANN models had RMSE values ranging from
0.971 to 2.126, although their predictors were limited to rainfall, mean temperature, relative
humidity and evapotranspiration, in contrast to the present study, which used eight predictors.
This implies that the characteristics and number of input variables affect the performance of
machine learning models.


The study further revealed that the performance of ANN and radial kernel-based SVM were
close to each other. This is similar to the result obtained by Belayneh and
Adamowski (2013), who stated that ANN and SVR had similar performance when predicting SPI
of different time scales.


In this study, both SVM and ANN have been applied for the prediction of extremes in Minna,
North Central Nigeria. Within the scope of this study, both SVM and ANN are
adequate for the prediction of extreme events and are able to explain more than 60% of the
variance in the computed SPI.


6. Conclusion 


Water resource and environmental engineers need accurate information in harnessing water for
diverse uses; it is therefore expedient to predict extreme events in order to ensure optimum water resource
planning and management. This study assessed the dimensions of short-term dry and wet phases
using SPI and predicted them using machine-learning techniques (SVM and ANN).


Based on the results of the study, the following key insights were drawn:


i. Rainfall was generally low between November and March and high between April
and October; this pattern was also exhibited by relative humidity, while the other
parameters, except wind speed, exhibited a reverse pattern;

ii. extreme wet conditions were experienced in 1962, 1966, 1983, 1986, 1988, 1991, 2003, 
2007, 2010, 2014 and 2015, while extreme dry conditions were experienced in 1976, 1985, 
1987, 1989, 1992, 2002, 2004, 2008, 2011, 2012; 

iii. while normal meteorological conditions occurred for about 74.8% of the study period,
8.9%, 4.6% and 1.9% of this period were moderately wet, severely wet and extremely wet 
respectively, and 4.7%, 3.4% and 1.7% of the study period were moderately dry, severely 
dry and extremely dry respectively; 

iv. SVM with radial kernel performed better than SVM with linear, sigmoid and polynomial
kernels, while ANN with two hidden layers performed better than ANN with a single
hidden layer;

v. SVM with radial kernel performed best in predicting calibration or training data, while 
ANN performed best in Validation or Test data set. 


Although both SVM and ANN showed good predictability, future studies may focus on
improving the performance of these models. This study serves as a baseline for the advancement 
of the application of machine-learning techniques in water resources management studies in 
Nigeria. 